Active Learning from Crowds with Unsure Option
Abstract
Learning from crowds, where the labels of data instances are collected through crowdsourcing, has attracted much attention in the past few years. In contrast to a typical crowdsourcing setting, where all data instances are assigned to annotators for labeling, active learning from crowds actively selects a subset of data instances and assigns them to the annotators, thereby reducing the cost of labeling. This paper goes a step further. Rather than assuming that all annotators must provide labels, we allow annotators to express that they are unsure about the assigned data instances. Adding the “unsure” option somewhat reduces the annotators’ workload, because saying “unsure” is easier than trying to provide a crisp label for a difficult data instance. Moreover, it is safer to use “unsure” feedback than labels from reluctant annotators, because the latter are more likely to be misleading. Furthermore, different annotators may have difficulty with different data instances, so the unsure option provides valuable information for modeling crowds’ expertise. We propose the ALCU-SVM algorithm for this new learning problem. Experimental studies on simulated and real crowdsourcing data show that, by exploiting the unsure option, ALCU-SVM achieves very promising performance.
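To make the role of the unsure option concrete, the following is a minimal Python sketch of how crowd responses that include "unsure" might be aggregated. It is not ALCU-SVM, whose details the abstract does not give. The `UNSURE` marker, the function names `aggregate` and `unsure_rate`, and the use of plain majority voting are all assumptions made for illustration.

```python
from collections import Counter

UNSURE = None  # hypothetical marker for an "unsure" response


def aggregate(responses):
    """Majority vote over crisp labels, skipping unsure responses.

    `responses` maps annotator id -> label or UNSURE.
    Returns (label, n_crisp_votes); label is None when no annotator gave
    a crisp label or when the two most common labels are tied.
    """
    votes = Counter(r for r in responses.values() if r is not UNSURE)
    n_votes = sum(votes.values())
    if not votes:
        return None, 0
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None, n_votes
    return ranked[0][0], n_votes


def unsure_rate(history):
    """Fraction of each annotator's responses that were 'unsure'."""
    totals, unsure = Counter(), Counter()
    for responses in history:
        for annotator, r in responses.items():
            totals[annotator] += 1
            if r is UNSURE:
                unsure[annotator] += 1
    return {a: unsure[a] / totals[a] for a in totals}


# Three queried instances, labels in {+1, -1}; annotator ids are made up.
history = [
    {"a1": 1, "a2": UNSURE, "a3": 1},
    {"a1": -1, "a2": UNSURE, "a3": 1},
    {"a1": UNSURE, "a2": UNSURE},
]
labels = [aggregate(r) for r in history]
rates = unsure_rate(history)
```

In this sketch an "unsure" response is never counted as a vote, so it cannot mislead the aggregate label, while the per-annotator unsure rate gives a crude signal of where each annotator struggles. A real system would likely condition that signal on the instance rather than use a single global rate.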
Related resources
Multi-Label Active Learning from Crowds
Multi-label active learning is a hot topic in reducing labeling cost by choosing the most valuable instances whose labels are queried from an oracle. In this paper, we consider pool-based multi-label active learning under the crowdsourcing setting, where during the active query process, instead of resorting to a high-cost oracle for the ground truth, multiple low-cost imperfect annotator...
Riding the Active Learning Wave: Using Problem-Based Learning as a Catalyst for Creating Faculty-Librarian Partnerships
With higher education shifting its emphasis from teaching to learning and from inputs to outcomes, active learning techniques are gaining prominence. Research has shown that students learn better when they actively engage with the course content, rather than passively absorb lecture material. However, many faculty are unsure of how to take advantage of these new techniques to improve the learning outcome...
Active Learning from Crowds
Obtaining labels can be expensive or time-consuming, but unlabeled data is often abundant and easier to obtain. Most learning tasks can be made more efficient, in terms of labeling cost, by intelligently choosing specific unlabeled instances to be labeled by an oracle. The general problem of optimally choosing these instances is known as active learning. As it is usually set in the context of su...
Minimizing Queries for Active Labeling with Sequential Analysis
When building datasets for supervised machine learning problems, data is often labelled manually by human annotators. In domains like medical imaging, acquiring labels can be prohibitively expensive. Both active learning and crowdsourcing have emerged as ways to frugally label datasets. In active learning, there has been recent interest in algorithms that exploit the data’s structure to direct ...
Learning through inquiry: student difficulties with online course-based Material
This study investigates the case-based learning experience of 133 undergraduate veterinarian science students. Using qualitative methodologies from relational Student Learning Research, variation in the quality of the learning experience was identified, ranging from coherent, deep, quality experiences of the cases, to experiences that separated significant aspects, such as the online case histo...